Gene Ontology (GO) Prediction using Machine Learning Methods
Abstract
We applied machine learning to predict whether a gene is involved in axon regeneration. We extracted 31 features from different databases and trained five machine learning models. Our optimal model, a Random Forest Classifier with 50 submodels, yielded a test score of 85.71%, which is 4.1% higher than the baseline score. We concluded that our models have some predictive capability. Similar methodology and features could be applied to predict other Gene Ontology (GO) terms.
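The comparison of a test score against a baseline score can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not say how the baseline was computed, so the sketch assumes a majority-class baseline (always predicting the most common training label). The label lists, the hypothetical model predictions, and the helper names `accuracy`, `majority_baseline`, and `gain_in_points` are all invented for the example.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty test set")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_train, y_test):
    """Accuracy obtained by always predicting the most common training label.

    Assumption: the source abstract does not define its baseline; a
    majority-class predictor is only one common choice.
    """
    majority_label = Counter(y_train).most_common(1)[0][0]
    return accuracy(y_test, [majority_label] * len(y_test))


def gain_in_points(model_acc, baseline_acc):
    """Difference between two accuracies, in percentage points."""
    return 100.0 * (model_acc - baseline_acc)


# Made-up toy labels: 1 = gene annotated with the GO term, 0 = not annotated.
y_train = [0, 0, 0, 1, 0, 1, 0, 0]
y_test = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
# Hypothetical classifier output on the test set (not real results).
y_pred = [0, 0, 1, 0, 0, 0, 0, 0, 1, 0]

model_acc = accuracy(y_test, y_pred)
baseline_acc = majority_baseline(y_train, y_test)
print(f"model accuracy:    {model_acc:.2%}")
print(f"baseline accuracy: {baseline_acc:.2%}")
print(f"gain over baseline: {gain_in_points(model_acc, baseline_acc):.2f} points")
```

Reporting the gain in percentage points keeps the comparison unambiguous; the abstract's "4.1% higher" could be read either as percentage points or as a relative improvement, and the source does not say which.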
Similar resources
Discriminative local subspaces in gene expression data for effective gene function prediction
MOTIVATION: Massive amounts of genome-wide gene expression data have become available, motivating the development of computational approaches that leverage this information to predict gene function. Among successful approaches, supervised machine learning methods, such as Support Vector Machines (SVMs), have shown superior prediction accuracy. However, these methods lack the simple biological in...
The effects of shared information on semantic calculations in the gene ontology
The structured vocabulary that describes gene function, the gene ontology (GO), serves as a powerful tool in biological research. One application of GO in computational biology calculates semantic similarity between two concepts to make inferences about the functional similarity of genes. A class of term similarity algorithms explicitly calculates the shared information (SI) between concepts th...
Gene Ontology-driven inference of protein-protein interactions using inducers
MOTIVATION: Protein-protein interactions (PPIs) are pivotal for many biological processes and similarity in Gene Ontology (GO) annotation has been found to be one of the strongest indicators for PPI. Most GO-driven algorithms for PPI inference combine machine learning and semantic similarity techniques. We introduce the concept of inducers as a method to integrate both approaches more effectivel...
FFPred: an integrated feature-based function prediction server for vertebrate proteomes
One of the challenges of the post-genomic era is to provide accurate function annotations for large volumes of data resulting from genome sequencing projects. Most function prediction servers utilize methods that transfer existing database annotations between orthologous sequences. In contrast, there are few methods that are independent of homology and can annotate distant and orphan protein se...
Technical supplement to “Consistent probabilistic outputs for protein function prediction”
Protein function prediction, in the context of the Gene Ontology, is a task that consists of answering, for a fixed protein X, a large number of binary questions of the form: "Does protein X belong to GO term Y?" Those binary classification problems are strongly related because the ontology consists of nested classes. Two natural requirements for this prediction problem are • that the set of...
Journal: CoRR
Volume: abs/1711.00001
Publication date: 2017